 bandwidth matrix


Adaptive Kernel Density Estimation with Pre-training

arXiv.org Machine Learning

Density estimation in high-dimensional settings is an important and challenging statistical problem. Traditional methods based on kernel smoothing are inefficient in high dimensions due to the difficulties in specifying appropriate location-adaptive kernels. In this work, we introduce pre-training, a key idea behind many cutting-edge AI technologies, to the context of non-parametric density estimation. By establishing a pre-trained neural network that can recommend an appropriate location-adaptive kernel for each sample point, efficient density estimation with adaptive kernels is achieved in high dimensions. A wide range of numerical experiments show that this strategy is highly effective for improving density-estimation accuracy when the target distribution is close to the distribution family used for pre-training. When the target distribution is substantially different from the pre-training distribution family, the benefit from the proposed pre-training strategy may be diluted, but it can be reactivated by an additional fine-tuning procedure.
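
To make the notion of a location-adaptive kernel concrete, the following is a minimal one-dimensional sketch of a sample-point adaptive estimator, in which every data point carries its own bandwidth. It is not the method of the abstract above: the function name `adaptive_kde`, the Gaussian kernel, and the hand-supplied `bandwidths` list are assumptions made for illustration, standing in for the kernels that a pre-trained network would recommend.

```python
import math


def adaptive_kde(x, points, bandwidths):
    """Evaluate a 1D sample-point adaptive KDE at x with Gaussian kernels.

    points[i] is a data point and bandwidths[i] is the bandwidth attached
    to it. Both are hypothetical inputs supplied by the caller; nothing
    here is learned or pre-trained.
    """
    if len(points) != len(bandwidths) or not points:
        raise ValueError("points and bandwidths must be non-empty and equal in length")
    total = 0.0
    for xi, hi in zip(points, bandwidths):
        z = (x - xi) / hi
        # Gaussian kernel centred at xi and scaled by its own bandwidth hi.
        total += math.exp(-0.5 * z * z) / (hi * math.sqrt(2.0 * math.pi))
    return total / len(points)


# A narrow kernel at 0.0 and a wide kernel at 3.0.
density_at_one = adaptive_kde(1.0, [0.0, 3.0], [0.5, 2.0])
```

Because each kernel is itself a normalized density, the average still integrates to one regardless of how the individual bandwidths are chosen.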


Kernel Density Estimation by Genetic Algorithm

arXiv.org Machine Learning

This study proposes a data condensation method for multivariate kernel density estimation by genetic algorithm. First, our proposed algorithm generates multiple subsamples of a given size with replacement from the original sample. The subsamples and their constituent data points are regarded as chromosomes and genes, respectively, in the terminology of genetic algorithms. Second, each pair of subsamples breeds two new subsamples, where each data point undergoes either crossover, mutation, or reproduction with a certain probability. The dominant subsamples in terms of fitness values are inherited by the next generation. This process is repeated generation by generation and, on completion, yields a sparse representation of the kernel density estimator. We confirmed from simulation studies that the resulting estimator can perform better than other well-known density estimators.


Kernel Density Estimation by Stagewise Algorithm with a Simple Dictionary

arXiv.org Machine Learning

This study proposes multivariate kernel density estimation by a stagewise minimization algorithm based on $U$-divergence and a simple dictionary. The dictionary consists of an appropriate scalar bandwidth matrix and a part of the original data. The resulting estimator provides data-adaptive weighting parameters and bandwidth matrices, and realizes a sparse representation of kernel density estimation. We develop a non-asymptotic error bound for the estimator obtained via the proposed stagewise minimization algorithm. Simulation studies confirm that the proposed estimator performs competitively with, or sometimes better than, other well-known density estimators.


Constrained Sampling from a Kernel Density Estimator to Generate Scenarios for the Assessment of Automated Vehicles

arXiv.org Artificial Intelligence

The safety assessment of automated vehicles (AVs) is an important aspect of the development cycle of AVs. A scenario-based assessment approach is accepted by many players in the field as part of the complete safety assessment. A scenario is a representation of a situation on the road to which the AV needs to respond appropriately. One way to generate the required scenario-based test descriptions is to parameterize the scenarios and to draw these parameters from a probability density function (pdf). Because the shape of the pdf is unknown beforehand, assuming a functional form of the pdf and fitting the parameters to the data may lead to inaccurate fits. As an alternative, Kernel Density Estimation (KDE) is a promising candidate for estimating the underlying pdf, because it is flexible with respect to the underlying distribution of the parameters. Drawing random samples from a pdf estimated with KDE is possible without evaluating the actual pdf, which makes it suitable for drawing random samples for, e.g., Monte Carlo methods. Sampling from a KDE such that the samples satisfy a linear equality constraint, however, has not been described in the literature, as far as the authors know. In this paper, we propose a method to sample from a pdf estimated using KDE, such that the samples satisfy a linear equality constraint. We also present our method as an algorithm in pseudo-code. The method can be used to generate scenarios that have, e.g., a predetermined starting speed, or to generate different types of scenarios. This paper also shows that the method for sampling scenarios can be used when a Singular Value Decomposition (SVD) is used to reduce the dimension of the parameter vectors.
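
The two sampling ideas in this abstract can be illustrated with a small sketch. It is not the paper's algorithm: it assumes a Gaussian kernel with a single isotropic bandwidth `h`, and it handles only the simplest linear equality constraint, namely fixing the first coordinate to a value `c` (for instance, a predetermined starting speed). The function names are hypothetical. Under these assumptions, conditioning amounts to reweighting each kernel by how well its centre matches the constraint and then drawing the remaining coordinates as usual.

```python
import math
import random


def sample_kde(data, h, rng):
    """Draw one sample from a Gaussian KDE without evaluating the pdf.

    Pick a data point uniformly at random, then add kernel noise to it.
    """
    center = rng.choice(data)
    return [rng.gauss(m, h) for m in center]


def sample_kde_fixed_first(data, h, c, rng):
    """Draw one sample whose first coordinate equals c.

    Assumes an isotropic Gaussian kernel, so the remaining coordinates of
    each kernel are independent of the first one.
    """
    # Each kernel is weighted by its marginal density at x[0] = c
    # (the shared normalizing constant cancels).
    weights = [math.exp(-0.5 * ((c - p[0]) / h) ** 2) for p in data]
    center = rng.choices(data, weights=weights, k=1)[0]
    return [c] + [rng.gauss(m, h) for m in center[1:]]


rng = random.Random(0)
data = [(0.0, 0.0), (10.0, 100.0)]  # toy scenario parameters
free_draw = sample_kde(data, 0.5, rng)
constrained_draw = sample_kde_fixed_first(data, 0.5, 0.0, rng)
```

With `c = 0.0` the second data point receives a negligible weight, so the constrained draws concentrate around the first point's remaining coordinate. A general constraint of the form A x = b with a full bandwidth matrix requires the conditional Gaussian formulas and is not covered by this sketch.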


Forest Guided Smoothing

arXiv.org Machine Learning

Random forests are often an accurate method for nonparametric regression, but they are notoriously difficult to interpret. Also, it is difficult to construct standard errors, confidence intervals and meaningful measures of variable importance. In this paper, we construct a spatially adaptive local linear smoother that approximates the forest. Our approach builds on the ideas in Bloniarz et al. (2016) and Friedberg et al. (2020). The main difference is that we define a one-parameter family of bandwidth matrices, which helps with the construction of confidence intervals and measures of variable importance. Our starting point is the well-known fact that a random forest can be regarded as a type of kernel smoother (Breiman (2000); Scornet (2016); Lin and Jeon (2006); Geurts et al. (2006); Hothorn et al. (2004); Meinshausen (2006)). We take it as a given that the forest is an accurate predictor and we do not make any attempt to improve the method. Instead, we want to find a family of linear smoothers that approximate the forest. Then we show how to use this family for interpretation, bias correction, confidence intervals, variable importance and for exploring the structure of the forest.


Variable Kernel Density Estimation in High-Dimensional Feature Spaces

AAAI Conferences

Estimating the joint probability density function of a dataset is a central task in many machine learning applications. In this work we address the fundamental problem of kernel bandwidth estimation for variable kernel density estimation in high-dimensional feature spaces. We derive a variable kernel bandwidth estimator by minimizing the leave-one-out entropy objective function and show that this estimator is capable of performing estimation in high-dimensional feature spaces with great success. We compare the performance of this estimator to state-of-the-art maximum-likelihood estimators on a number of representative high-dimensional machine learning tasks and show that the newly introduced minimum leave-one-out entropy estimator performs optimally on a number of the high-dimensional datasets considered.


A comparison of bandwidth selectors for mean shift clustering

arXiv.org Machine Learning

We explore the performance of several automatic bandwidth selectors, originally designed for density gradient estimation, as data-based procedures for nonparametric, modal clustering. The key tool to obtain a clustering from density gradient estimators is the mean shift algorithm, which makes it possible to obtain a partition not only of the data sample, but also of the whole space. The results of our simulation study suggest that most of the methods considered here, like cross-validation and plug-in bandwidth selectors, are useful for cluster analysis via the mean shift algorithm. Keywords: bandwidth selection, mean shift algorithm, modal clustering.
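
As a rough illustration of how the mean shift algorithm turns a bandwidth into a clustering, here is a minimal one-dimensional sketch with a Gaussian kernel. The function names, the fixed bandwidth `h`, and the tolerances are assumptions chosen for the example; the bandwidth selectors compared in the abstract are not implemented here.

```python
import math


def mean_shift_mode(x, data, h, tol=1e-8, max_iter=1000):
    """Iterate the Gaussian mean shift update from x until it stops moving."""
    for _ in range(max_iter):
        # Kernel weights of every data point as seen from the current x.
        w = [math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data]
        new_x = sum(wi * xi for wi, xi in zip(w, data)) / sum(w)
        if abs(new_x - x) < tol:
            return new_x
        x = new_x
    return x


def mean_shift_clusters(data, h, merge_tol=1e-3):
    """Label each data point by the mode its mean shift path converges to."""
    modes, labels = [], []
    for xi in data:
        m = mean_shift_mode(xi, data, h)
        for k, existing in enumerate(modes):
            if abs(m - existing) < merge_tol:
                labels.append(k)
                break
        else:
            # No known mode is close enough, so register a new one.
            modes.append(m)
            labels.append(len(modes) - 1)
    return modes, labels


data = [0.0, 0.1, -0.1, 5.0, 5.1, 4.9]
modes, labels = mean_shift_clusters(data, h=0.5)
```

The same `mean_shift_mode` call can be started from a point that is not in the sample, which is how the procedure assigns a cluster to any location in the space and not only to the observed data.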


Data-driven density derivative estimation, with applications to nonparametric clustering and bump hunting

arXiv.org Machine Learning

Important information concerning a multivariate data set, such as clusters and modal regions, is contained in the derivatives of the probability density function. Despite this importance, nonparametric estimation of higher order derivatives of the density function has received only relatively scant attention. Kernel estimators of density functions are widely used as they exhibit excellent theoretical and practical properties, though their generalization to density derivatives has progressed more slowly due to the mathematical intractabilities encountered in the crucial problem of bandwidth (or smoothing parameter) selection. This paper presents the first fully automatic, data-based bandwidth selectors for multivariate kernel density derivative estimators. This is achieved by synthesizing recent advances in matrix analytic theory which allow mathematically and computationally tractable representations of higher order derivatives of multivariate vector-valued functions. The theoretical asymptotic properties as well as the finite sample behaviour of the proposed selectors are studied. In addition, we explore in detail the applications of the new data-driven methods for two other statistical problems: clustering and bump hunting. The introduced techniques are combined with the mean shift algorithm to develop novel automatic, nonparametric clustering procedures which are shown to outperform mixture-model cluster analysis and other recent nonparametric approaches in practice. Furthermore, the advantage of using smoothing parameters designed for density derivative estimation in feature significance analysis for bump hunting is illustrated with a real data example.